Solving Large-Scale and Sparse-Reward DEC-POMDPs with Correlation-MDPs
Authors
Abstract
Within a group of cooperating agents, the decision making of an individual agent depends on the actions of the other agents. Much effort has been devoted to solving this problem under additional assumptions about the agents' communication abilities; however, in some real-world applications communication is limited and these assumptions are rarely satisfied. A recently developed alternative is to employ a correlation device that correlates the agents' behavior without any exchange of information during execution. In this paper, we apply a correlation device to large-scale and sparse-reward domains. As a basis we use the framework of infinite-horizon DEC-POMDPs, which represents policies as joint stochastic finite-state controllers. To solve a problem of this kind, a correlation device is first computed by solving a Correlation Markov Decision Process (Correlation-MDP) and is then used to improve each agent's local controller. This method achieves a tradeoff between computational complexity and the quality of the approximation. In addition, we demonstrate that adversarial problems can be handled by encoding information about the opponents' behavior in the correlation device. We have implemented the proposed method in our 2D simulated robot soccer team, and its performance at RoboCup-2006 was encouraging.
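To make the pipeline concrete, below is a minimal Python sketch of execution-time behavior under a correlation device, assuming one plausible reading of the abstract: the device is a small Markov chain whose transition probabilities would come from solving the Correlation-MDP, and each agent's stochastic finite-state controller conditions its action and node transitions on the device state. All class names, data layouts, and the shared-seed trick for communication-free replay are illustrative assumptions, not the paper's implementation.

```python
import random

class CorrelationDevice:
    """A small Markov chain over device states. Its evolution ignores the
    agents' observations, so every agent can replay the identical state
    sequence from a seed agreed on at planning time; no communication is
    needed during execution. (Hypothetical sketch, not the paper's code.)"""

    def __init__(self, transitions, initial_state=0):
        self.transitions = transitions  # transitions[c] -> {next_state: prob}
        self.state = initial_state

    def step(self, rng):
        dist = self.transitions[self.state]
        self.state = rng.choices(list(dist), weights=list(dist.values()))[0]
        return self.state

class LocalController:
    """A stochastic finite-state controller for one agent, with action and
    node-transition distributions conditioned on the device state c."""

    def __init__(self, action_dist, node_dist, initial_node=0):
        self.action_dist = action_dist  # action_dist[(node, c)] -> {action: prob}
        self.node_dist = node_dist      # node_dist[(node, c, obs)] -> {next_node: prob}
        self.node = initial_node

    def act(self, c, rng):
        dist = self.action_dist[(self.node, c)]
        return rng.choices(list(dist), weights=list(dist.values()))[0]

    def observe(self, c, obs, rng):
        dist = self.node_dist[(self.node, c, obs)]
        self.node = rng.choices(list(dist), weights=list(dist.values()))[0]

# Toy two-state device and a one-node controller, just to show the wiring.
device = CorrelationDevice({0: {0: 0.5, 1: 0.5}, 1: {0: 0.5, 1: 0.5}})
controller = LocalController(
    action_dist={(0, 0): {"defend": 1.0}, (0, 1): {"attack": 1.0}},
    node_dist={(0, 0, "ball-far"): {0: 1.0}, (0, 1, "ball-far"): {0: 1.0}},
)

device_rng = random.Random(42)  # shared seed: identical device replay for all agents
agent_rng = random.Random()     # private randomness for this agent's own choices

for _ in range(3):
    c = device.step(device_rng)            # same c seen by every agent this step
    action = controller.act(c, agent_rng)  # action correlated across agents via c
    print(f"device state {c} -> action {action}")
    controller.observe(c, "ball-far", agent_rng)
```

On this reading, solving the Correlation-MDP amounts to choosing the device's transition probabilities, after which each agent's local controller is improved against the fixed device; keeping the device small is what yields the tradeoff between complexity and approximation quality that the abstract describes.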
Similar papers
Automated Generation of Interaction Graphs for Value-Factored Dec-POMDPs
The Decentralized Partially Observable Markov Decision Process (Dec-POMDP) is a powerful model for multiagent planning under uncertainty, but its applicability is hindered by its high complexity – solving Dec-POMDPs optimally is NEXP-hard. Recently, Kumar et al. introduced the Value Factorization (VF) framework, which exploits decomposable value functions that can be factored into subfunctions....
Full text
Probabilistic Planning with Risk-Sensitive Criterion
Probabilistic planning models and, in particular, Markov Decision Processes (MDPs), Partially Observable Markov Decision Processes (POMDPs) and Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) have been extensively used by the AI and decision-theoretic communities for planning under uncertainty. Typically, the solvers for probabilistic planning models find policies that min...
Full text
Exploiting separability in multiagent planning with continuous-state MDPs
Recent years have seen significant advances in techniques for optimally solving multiagent problems represented as decentralized partially observable Markov decision processes (Dec-POMDPs). A new method achieves scalability gains by converting Dec-POMDPs into continuous-state MDPs. This method relies on the assumption of a centralized planning phase that generates a set of decentralized policie...
Full text
Filtered Fictitious Play for Perturbed Observation Potential Games and Decentralised POMDPs
Potential games and decentralised partially observable MDPs (Dec-POMDPs) are two commonly used models of multi-agent interaction, for static optimisation and sequential decision-making settings, respectively. In this paper we introduce filtered fictitious play for solving repeated potential games in which each player's observations of others' actions are perturbed by random noise, and use this...
Full text